ABSTRACT

This report describes a test of the robustness of 
factor-analytic methods in the face of various types of scale 
transformations on the data. Because of the complexities that would 
be involved in an exact analytical investigation, the tests were done 
with simulated sets of data having different factor structures. After 
factor analyzing the original data sets, scale transformations were 
done, and the transformed data sets were factor analyzed. Comparisons 
made between results obtained before and after transformations lead 
to the conclusion that monotonic transformations do not alter the 
results, while nonmonotonic transformations may. Because the 
comparisons were made with only a small number of data sets, it is 
suggested that special choices of data, factor-analytic methods, or 
scale transformations may limit the validity of this conclusion. 
(WDP) 
Data with different factor structures are generated and analyzed.
The variables are transformed and reanalyzed, and comparisons
between factor analyses before and after transformation are made.
All comparisons indicate the same conclusion: monotonic transformations
do not change the results, while non-monotonic transformations
may. Special choices of data, factor-analytic method, transformations
and ways of comparison may limit the validity of this conclusion.
INTRODUCTION

Educational research uses many concepts which are not unequivocally
defined. This has led to different variables being used to measure the
(nominally) same property. Relations between these are almost always
stochastic, ranging from complete independence to the maximal correlation
"allowed" by their reliabilities. Since several variables, proposed to
measure the same property, are seldom congeneric and often not even
isomorphic in their specific true scores, they can hardly be said to
measure the same property.

However, from a practical viewpoint, it need not be important that
variables measure exactly the same property. It is more important that
they represent the same property to a sufficient extent. By this I mean
that the same result is obtained by different collections of variables,
which are considered to measure the same properties, when they are
used on the same or similar measurement objects. This is a vaguely
formulated but important principle. Above all it means that a researcher
draws the same conclusions, generates the same hypotheses and makes
the same decisions, independent of which collection of variables the
results are based on. At the present state of educational measurements
it is, no doubt, of importance to investigate the robustness of results
based on different collections of variables. Such studies can never be
definitive, but this report tries to give some results relevant to the
question of robustness.

These investigations can be performed in different ways: one can use
real data or simulated data, or one can make a purely analytical
(mathematical) investigation. Real data have the advantage of permitting
concrete interpretations. The drawback is that you, as a rule, must take
data already collected, which often are designed for quite another purpose.
It is considerably simpler to make a systematic investigation by simulating
data, since data can be chosen almost without restriction. However, the
amount of data which can be reasonably analyzed limits the possibility
of generalizing from simulation experiments. An analytical investigation
is here superior, since it is not based upon special data. Owing to complex
problems, analytical investigations are not always feasible.

I have chosen to study the influence of some transformations on
factor-analytical results. As an example, suppose that results of factor
analysis are robust to monotonic transformations. It would then seem to
me that the scale problem of the instruments chosen is not very urgent,
provided that the properties are defined sufficiently well to determine
ordinal measurements. Strictly speaking, this study investigates only the
robustness of
allocating different numbers to the possible outcomes of a certain collection 
of instruments. But since functions can approximate stochastic relations 
this report also mirrors, more or less, the robustness of results based on 
different collections of instruments. 

The present report comprises simulations only. In my opinion, it 
would have been better to make an analytical investigation. However, the 
complexity of the problems is clearly too great for me - I do not even 
know how they should be formulated. Simulation is a solution which can be 
resorted to when an analytical investigation does not seem possible. The 
results of this report therefore constitute no rigid proof either for or 
against factor-analytical robustness to transformations: they can only make 
it more or less credible. 

DESIGN OF THE EXPERIMENT 
Thurstone (1947, p. 369) says that comparisons have shown that different
monotonic transformations give essentially the same factor structure,
when this is a simple structure. However, he does not show any results.
His statement is, in a sense, corroborated by his box example, which
can be found in several places in his book. If one knows that a collection
of variables satisfies the linear factor model, then non-linear monotonic
transformations of these variables cannot satisfy the model in the same
way. The variables of the box example are different measurements of boxes,
which often are non-linear functions of height, length and breadth. In
spite of this, factor loadings of the rotated factors give support for the
above-mentioned dimensions. Thus, there are certain reasons to assume the
robustness of factor analysis, at least to monotonic transformations.

As a further support for the same presumption, one may add the
following simple, analytic result. Suppose that $y_1$ and $y_2$ have a
bivariate normal distribution with expected values $\mu_1$ and $\mu_2$,
variances $\sigma_1^2$ and $\sigma_2^2$ and correlation $\rho_{y_1 y_2}$. Then

(1)  $\rho_{y_1^2 y_2} = \dfrac{\rho_{y_1 y_2}}{\sqrt{1 + V_1^2/2}}$

and

(2)  $\rho_{y_1^2 y_2^2} = \dfrac{\rho_{y_1 y_2} + \tfrac{1}{2} V_1 V_2 \, \rho_{y_1 y_2}^2}{\sqrt{1 + V_1^2/2} \, \sqrt{1 + V_2^2/2}}$

Here $V = \sigma/\mu$ is the coefficient of variation, and each square root
shall have the same sign as the corresponding $\mu$. As $V$ usually lies in
the interval (0.0, 0.5) for variables within behavioural research, a
quadratic transformation has little effect on $\rho$. One dare assume that
such a transformation hardly changes results of factor analysis on
approximately normally distributed variables. It should be pointed out that
$y^2$ is here not a strictly monotonic transformation, since
$-\infty < y < \infty$. The extreme case $z = [(y - \mu_y)/\sigma_y]^2$ is
clearly non-monotonic, and formulas 1 and 2 show that $\rho_{z_1 y_2} = 0$,
independent of $\rho_{y_1 y_2}$, and $\rho_{z_1 z_2} = \rho_{y_1 y_2}^2$.
This indicates that non-monotonic transformations can drastically change
the factor structure.
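
The behaviour of correlations under squaring can be checked numerically. The following is a minimal sketch and not part of the original study: it assumes NumPy, positive means, and illustrative parameter values (correlation 0.6, coefficient of variation 0.4) chosen here only for demonstration. The two closed forms in the functions follow from the moments of the bivariate normal distribution; the script compares them with Monte Carlo estimates and also evaluates the extreme case of squared standard scores.

```python
import numpy as np


def corr_square_one(rho, v1):
    """Correlation of y1**2 with y2 for bivariate normal y1, y2 (mu1 > 0)."""
    return rho / np.sqrt(1.0 + v1**2 / 2.0)


def corr_square_both(rho, v1, v2):
    """Correlation of y1**2 with y2**2 for bivariate normal y1, y2 (mu1, mu2 > 0)."""
    num = rho + 0.5 * v1 * v2 * rho**2
    den = np.sqrt((1.0 + v1**2 / 2.0) * (1.0 + v2**2 / 2.0))
    return num / den


# Illustrative values (assumptions, not taken from the report).
rho, mu, sigma, n = 0.6, 5.0, 2.0, 200_000
v = sigma / mu  # coefficient of variation, here 0.4

rng = np.random.default_rng(0)
cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])
y1, y2 = rng.multivariate_normal([mu, mu], cov, size=n).T

# Monte Carlo estimates for the quadratic transformation.
mc_one = np.corrcoef(y1**2, y2)[0, 1]
mc_both = np.corrcoef(y1**2, y2**2)[0, 1]

# Extreme, clearly non-monotonic case: squared standard scores.
z1 = ((y1 - mu) / sigma) ** 2
z2 = ((y2 - mu) / sigma) ** 2
mc_z_y = np.corrcoef(z1, y2)[0, 1]
mc_z_z = np.corrcoef(z1, z2)[0, 1]

print(f"y1^2 vs y2  : formula {corr_square_one(rho, v):.3f}, simulated {mc_one:.3f}")
print(f"y1^2 vs y2^2: formula {corr_square_both(rho, v, v):.3f}, simulated {mc_both:.3f}")
print(f"z1 vs y2    : expected 0, simulated {mc_z_y:.3f}")
print(f"z1 vs z2    : expected rho^2 = {rho**2:.3f}, simulated {mc_z_z:.3f}")
```

With a coefficient of variation of 0.4 the squared variables keep nearly the whole correlation, whereas the squared standard scores lose it, which is the contrast the argument relies on.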

A number of collections of variables with known factor structures have
been generated. Then these have been transformed in different ways, and
factor analyses before and after transformation are compared in some
aspects. The original variables have been generated by the following
factor model:

(3)  $y_i = \sum_{k=1}^{m} b_{ik} x_k + e_i, \qquad i = 1, \ldots, p.$

Here $y_i$ is a manifest variable, $x_k$ a common factor, $e_i$ a unique
factor and $b_{ik}$ a factor loading. The model may now be realized in
various ways. I have chosen to make $x_k$ and $e_i$ (p + m in number)
independent of each other. Also, all $x_k$ are so called stanine variables
(approximately normally distributed on the integers 1(1)9), all $e_i$ are
rectangularly distributed on the interval (0, 2) and $b_{ik} > 0$. This
gives all $y_i > 0$, symmetrically distributed with negative kurtoses, a
distribution rather common within behavioural sciences. The linear
correlation between $y_i$ and $y_j$ is now

(4)  $\rho_{y_i y_j} = \dfrac{3.84 \sum_{k=1}^{m} b_{ik} b_{jk}}{\sqrt{\left(3.84 \sum_{k=1}^{m} b_{ik}^2 + 0.33\right)\left(3.84 \sum_{k=1}^{m} b_{jk}^2 + 0.33\right)}}$

A $\rho$ value aimed at can be obtained by suitable choices of the factor
loadings. As $b_{ik} > 0$, $\rho_{y_i y_j}$ becomes positive, and this is a
general fact for several variable domains. The choice of identically
distributed $e_i$ implies the rank correlation between communality and
variance of $y_i$ to be unity. However, I do not
think that this restriction makes the results less general.
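
The generation scheme described above can be sketched as follows. This is an illustrative reconstruction, not the program used for the report: it assumes NumPy, the conventional stanine percentages 4-7-12-17-20-17-12-7-4 (which give a factor variance of 3.84), the exact uniform variance 1/3 where the text rounds to 0.33, and a hypothetical loading matrix `B` drawn at random.

```python
import numpy as np

# Conventional stanine percentages on the integers 1..9 (an assumption).
STANINE_VALUES = np.arange(1, 10)
STANINE_PROBS = np.array([4, 7, 12, 17, 20, 17, 12, 7, 4]) / 100.0


def generate_manifest(loadings, n_objects, rng):
    """Draw y = x @ B.T + e with stanine factors x and uniform(0, 2) unique factors e."""
    p, m = loadings.shape
    x = rng.choice(STANINE_VALUES, size=(n_objects, m), p=STANINE_PROBS)
    e = rng.uniform(0.0, 2.0, size=(n_objects, p))
    return x @ loadings.T + e


def model_correlations(loadings, factor_var=3.84, unique_var=1.0 / 3.0):
    """Population correlations implied by the loadings for independent factors."""
    common = factor_var * loadings @ loadings.T
    sd = np.sqrt(np.diag(common) + unique_var)
    corr = common / np.outer(sd, sd)
    np.fill_diagonal(corr, 1.0)
    return corr


rng = np.random.default_rng(1)
B = rng.uniform(0.1, 0.5, size=(10, 4))  # hypothetical loadings, p = 10, m = 4
Y = generate_manifest(B, 200, rng)       # 200 measurement objects

gap = np.abs(np.corrcoef(Y, rowvar=False) - model_correlations(B)).max()
print(f"largest gap between sample and model correlation at n = 200: {gap:.3f}")
```

Rerunning with a larger `n_objects` shrinks the gap, which illustrates why the implied correlations hold exactly only for an infinite number of objects.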

For each collection of variables there remains, among other things,
the choice of m, p and ρ. As the number of factor loadings which must be
determined for a given collection is mp, I have put some restrictions on
the choice of p because of the amount of work involved: it is fixed to 10
and 30. Ten manifest variables constitute a small number as far as factor
analyses used in behavioural research are concerned, while thirty is a
more common, though not especially large number. Then m has been chosen to
indicate either a small or large number of factors: for p = 10, m = 1 or 4
and for p = 30, m = 4 or 10. For these four cases I have chosen correlation
matrices in three ways: 0.0 < ρ < 0.4, 0.0 < ρ < 0.9 and 0.5 < ρ < 0.9.

The factor-analytic method used here is the principal axes solution
with varimax rotation according to the program BMDX72, Dixon (1970).
I think that this method is too much used. Whether this depends on tradition
(most earlier analyses built upon the centroid method with graphical
rotations to simple structures, of which the method of BMDX72 is a modern
variant), easily available programs or difficulties in understanding newer
methods may be left an open question. As is clear from e.g. Jöreskog's
papers, the newer, inferential methods of factor analysis are more stringent
and flexible than the older ones and will, in my opinion, dominate future
uses of factor analysis. Today they are more or less limited because of
computers having insufficient internal memories. For instance, the
maximal number of variables which can be analyzed is not seldom too small
for applications within behavioural science. Inertia of innovation may be
added to this: the inferential factor analysis puts some new demands on the
user.

Although I could have used programs like ACOVS or LISREL, see e.g.
Jöreskog (1973) and Jöreskog & van Thillo (1973), BMDX72 has been used
for two reasons. Partly because it is easily available but, above all,
because so many researchers have used it (or more correctly: its parallel
BMD03M). I do not think that results would have been essentially different,
as far as the robustness of the estimates of parameters of the factor
model is concerned, with a method other than that of BMDX72.

The program has been run twice for every case, partly with 1.0 and
partly with squared multiple correlations (R²) in the principal diagonal
of the correlation matrix. There are a total of 24 factor analyses on
untransformed (original) variables, which are designed as a 2x2x3x2 factorial
experiment. However, there are only 12 different collections of variables
generated, since the two types of values in the principal diagonal are used
for the same generation. Table 1 shows the design with the numbering of
cases which will be used when reporting the results.
Table 1. Numbering of the different cases

Correlations     Diagonal        p = 10              p = 30
                 value       m = 1    m = 4      m = 4    m = 10

0.0 < ρ < 0.4    R²           1A       4A         7A       10A
                 1.0          1B       4B         7B       10B

0.0 < ρ < 0.9    R²           2A       5A         8A       11A
                 1.0          2B       5B         8B       11B

0.5 < ρ < 0.9    R²           3A       6A         9A       12A
                 1.0          3B       6B         9B       12B
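
The two choices of principal diagonal (1.0 or R²) can be illustrated with a short sketch. It is not the BMDX72 algorithm, whose internals are not described here: it assumes NumPy, takes the squared multiple correlation of each variable on all others from the inverse correlation matrix, and extracts eigenvalues in a single pass without iteration or rotation. The function names and the equicorrelated example matrix are hypothetical.

```python
import numpy as np


def squared_multiple_correlations(corr):
    """R^2 of each variable on all the others: 1 - 1 / diag(inverse of corr)."""
    return 1.0 - 1.0 / np.diag(np.linalg.inv(corr))


def principal_axes_eigenvalues(corr, diagonal="smc"):
    """Eigenvalues in descending order, with R^2 ("smc") or 1.0 ("unity") in the diagonal."""
    work = np.array(corr, dtype=float)  # copy, so the input is left untouched
    if diagonal == "smc":
        np.fill_diagonal(work, squared_multiple_correlations(corr))
    return np.linalg.eigvalsh(work)[::-1]


# Hypothetical example: four variables, all intercorrelations 0.5.
corr = np.full((4, 4), 0.5)
np.fill_diagonal(corr, 1.0)

print("R^2 values          :", squared_multiple_correlations(corr))
print("eigenvalues, unities:", principal_axes_eigenvalues(corr, "unity"))
print("eigenvalues, R^2    :", principal_axes_eigenvalues(corr, "smc"))
```

Replacing the unities by R² lowers every eigenvalue by the same amount in this symmetric example, so fewer factors pass a fixed eigenvalue threshold.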



Formula 4 is exactly valid when we have an infinite number of measurement
objects and will be approximate when only a small number of objects is
available. For instance, the factor variance is not exactly 3.84 and the
factors are not exactly independent of each other. This implies that you
cannot, in practice, obtain exactly the ρ values aimed at, e.g. ρ > 0.0
may very well be realized as -0.1. A compromise has to be made between
reasonable costs for computer time and a sufficient number of objects to
approximate the model. Trial runs with 50, 100 and 200 objects showed
that only 200 objects give acceptable agreement between model and data.
Each of the 24 cases shown in table 1 is thus based on 200 measurement
objects, a rather common sample size within educational research. The
cases have been generated twice in order to get an idea of random variation
at 200 objects. Information about this variation will be used when reporting
the result.

The number of possible transformations is infinite. With regard to
computer time and the amount of work when comparing factor analyses,
the number must be strongly limited. My choice is hardly very rational or
systematic: I do not even know how it could be made so. Positively skewed
distributions are not unusual and it is sometimes recommended that these
should be normalized through square root or logarithmic transformations.
These functions have been exploited, as well as their inverse functions.
As an example of a more general monotonic transformation, I have used
rank numbers instead of the original scores. Finally, one non-monotonic
transformation has also been chosen. Such transformations are sometimes
used, e.g. absolute deviation from an ideal point on a scale.

The 12 untransformed cases (24 factor analyses) have given rise to four
new transformed sets. For one set, half of the variables (those with odd
numbers) have been transformed from y into y²/10, while the other variables
have been transformed from y into √y. For a second set, the corresponding
functions are exp(y/6) and ln(1+y), and in a third set all variables have
been ranked. The set involving non-monotonic transformations uses
z = [(y - μ_y)/σ_y]² for half of the variables of a case and leaves the
other half unchanged. Every set comprises 24 factor analyses as every case
is run twice, both with R² and 1.0 in the principal diagonal of the
correlation matrix. Thus there are totally 144 factor analyses (the
untransformed set is generated twice).
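
The four transformed sets can be sketched in code. This is an illustration under stated assumptions, not the original program: it assumes NumPy, a data matrix with objects as rows and variables as columns, positive scores without ties, and that "odd-numbered" means the 1st, 3rd, 5th, ... variable. Which half of the variables received the non-monotonic transformation is not stated in the text, so the odd-numbered half is used here as a guess; all function names are hypothetical.

```python
import numpy as np


def _odd_numbered(n_vars):
    """Boolean mask for variables numbered 1, 3, 5, ... (columns 0, 2, 4, ...)."""
    mask = np.zeros(n_vars, dtype=bool)
    mask[0::2] = True
    return mask


def transform_set_iii(y):
    """y**2 / 10 for odd-numbered variables, sqrt(y) for the others."""
    odd = _odd_numbered(y.shape[1])
    out = np.sqrt(y)
    out[:, odd] = y[:, odd] ** 2 / 10.0
    return out


def transform_set_iv(y):
    """exp(y / 6) for odd-numbered variables, ln(1 + y) for the others."""
    odd = _odd_numbered(y.shape[1])
    out = np.log1p(y)
    out[:, odd] = np.exp(y[:, odd] / 6.0)
    return out


def transform_set_v(y):
    """Rank numbers 1..n within each variable (assumes no ties)."""
    return np.argsort(np.argsort(y, axis=0), axis=0) + 1.0


def transform_set_vi(y):
    """Squared standard score for half of the variables (odd-numbered: an assumption)."""
    odd = _odd_numbered(y.shape[1])
    out = np.array(y, dtype=float)
    z = (y[:, odd] - y[:, odd].mean(axis=0)) / y[:, odd].std(axis=0)
    out[:, odd] = z**2
    return out


rng = np.random.default_rng(3)
scores = rng.uniform(1.0, 20.0, size=(50, 4))  # hypothetical positive scores
for name, fn in [("III", transform_set_iii), ("IV", transform_set_iv),
                 ("V", transform_set_v), ("VI", transform_set_vi)]:
    same_order = all(
        np.array_equal(np.argsort(fn(scores)[:, j]), np.argsort(scores[:, j]))
        for j in range(scores.shape[1])
    )
    print(f"set {name}: rank order of every variable preserved = {same_order}")
```

The printed flag separates the monotonic sets, which keep the rank order of every variable, from the non-monotonic one, which does not.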

The way of comparing results of factor analyses is not self-evident.
What is meant by saying that two analyses give essentially the same answer?
Which aspects are to be compared and how? An important interpretation of
"the same answer" is that different researchers understand the factors in
a similar way. This configurational invariance is in most cases sufficient.
The drawback of simulated data is the impossibility of empirically
interpreting factors: data are, so to say, without content. One then has to
examine numerical invariance by calculating different indices for deviation.
This is a more rigorous comparison: e.g. numerical invariance of factor
loadings implies configurational invariance but the reverse need not be
true.

Comparisons will be concentrated on eigenvalues: the number of factors
with eigenvalues above 1.0 (a common criterion used when rotating factors),
the proportion of total variance accounted for by these factors and, above
all, the distribution of eigenvalues of unrotated factors. Comparisons of
communalities will also be commented upon, while factor loadings are
discussed rather little. The other comparisons should still give the reader
an understanding of the influence of the transformations.
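
The first two comparison aspects can be written down compactly. The sketch below is an illustration rather than the procedure of the original program: it assumes NumPy and takes "proportion of total variance" to mean the sum of the retained eigenvalues divided by the number of variables, which is an interpretation made here. The function name and the example matrix are hypothetical.

```python
import numpy as np


def eigenvalue_criterion_summary(eigenvalues, threshold=1.0):
    """Return (number of eigenvalues above threshold, their share of total variance).

    The share is the sum of the retained eigenvalues divided by the number
    of variables (assumed equal to the number of eigenvalues supplied).
    """
    eig = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    keep = eig > threshold
    return int(keep.sum()), float(eig[keep].sum() / eig.size)


# Hypothetical example: four variables, all intercorrelations 0.5.
corr = np.full((4, 4), 0.5)
np.fill_diagonal(corr, 1.0)
n_factors, share = eigenvalue_criterion_summary(np.linalg.eigvalsh(corr))
print(f"factors with eigenvalue above 1.0: {n_factors}, share of variance: {share:.3f}")
```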



RESULTS 

The numbering of cases which was shown in table 1 is used in the following
tables. The six sets of cases will be numbered by Roman numerals: I and II
stand for the untransformed sets, III concerns the transformations y²/10
and √y, IV refers to exp(y/6) and ln(1+y), V comprises the rank numbers
and VI denotes the set with non-monotonic transformations. Set I has been
used when generating the transformed sets. Comparisons between set I and
sets III, IV, V and VI are therefore the most important ones. However,
comparisons between I and II give you a hint of the size of random variations
and can be exploited for discussions of the other comparisons.

The first aspect of comparison concerns the number of unrotated factors
with eigenvalues greater than 1.0. This is a "blind" criterion which is often,
perhaps too often, used when determining the number of factors to rotate.
If you have no prior idea about the number of interpretable factors, it is
perhaps wiser to examine more than one solution. It is far from certain that
a factor with an eigenvalue of 1.5 can be given any interpretation, while I
have sometimes seen how a factor with an eigenvalue below 1.0 has contributed
essentially to the understanding of a variable domain. Dempster (1969,
p. 139) has a similar, though more extreme attitude concerning component
analysis.

Table 2 gives the numbers of unrotated factors with eigenvalues exceeding 
1.0 and table 3 shows the proportion of total variance which these factors
represent. It is clear from these tables that III, IV and V are in very good
agreement with I, and the difference between I and II is also small. As might
have been expected, VI shows greater deviation. The number of factors to
rotate is, with one exception, equal to or greater than that for I, but in spite
of this fact the proportion is often smaller for VI than for I. Thus, set VI
has a flatter eigenvalue distribution than I, which is reasonable with regard 
to formulas 1 and 2. However, not even as extreme a transformation as VI 
comprises can be said to produce very great differences. But tables 2 and 3 
present very rough measures: they tell rather little about similarities or 
differences between corresponding factors of two sets. 



Table 2. Number of factors with eigenvalues above 1.0
(? marks an entry that is illegible or missing in the source)

Case     I     II    III    IV     V     VI

1A       1      1      1     1     1      1
1B       3      3      ?     3     3      4
2A       1      1      1     ?     1      2
2B       ?      ?      1     ?     1      3
3A       1      1      1     1     1      2
3B       1      1      1     1     1      2
4A       1      1      1     1     ?      ?
4B       3      4      3     3     3      4
5A       1      1      1     1     ?      2
5B       2      2      2     2     ?      3
6A       1      1      1     1     1      2
6B       1      1      1     1     ?      2
7A       2      2      2     2     2      2
7B       9     10     10     9     ?      ?
8A       2      2      2     2     2      2
8B       5      5      5     5     5      7
9A       2      2      2     2     2      2
9B       2      3      2     2     2      4
10A      2      2      2     ?     2      2
10B     11     10     10    11    11     11
11A      3      4      3     3     3      2
11B      5      5      5     5     6      7
12A      3      5      3     3     3      4
12B      5      5      5     5     5      6



Table 3. Proportion of total variance for factors with eigenvalues above 1.0

Case      I       II      III     IV      V       VI

1A      0.222   0.162   0.213   0.214   0.210   0.124
1B      0.515   0.471   0.507   0.508   0.505   0.553
2A      0.610   0.618   0.693   0.594   0.598   0.601
2B      0.638   0.742   0.622   0.625   0.626   0.690
3A      0.744   0.754   0.716   0.722   0.721   0.640
3B      0.768   0.777   0.742   0.748   0.745   0.710
4A      0.210   0.205   0.209   0.209   0.194   0.104
4B      0.504   0.607   0.502   0.503   0.489   0.544
5A      0.538   0.537   0.534   0.526   0.522   0.424
5B      0.679   0.681   0.677   0.670   0.664   0.630
6A      0.686   0.674   0.677   0.672   0.662   0.572
6B      0.711   0.700   0.704   0.700   0.690   0.659
7A      0.211   0.230   0.209   0.211   0.192   0.139
7B      0.633   0.606   0.590   0.559   0.639   0.643
8A      0.586   0.592   0.579   0.575   0.664   0.465
8B      0.727   0.731   0.722   0.718   0.708   0.687
9A      0.741   0.712   0.734   0.727   0.729   0.606
9B      0.753   0.762   0.746   0.740   0.741   0.703
10A     0.212   0.235   0.210   0.211   0.195   0.129
10B     0.622   0.604   0.587   0.622   0.609   0.568
11A     0.567   0.595   0.561   0.555   0.533   0.429
11B     0.674   0.672   0.669   0.664   0.678   0.662
12A     0.704   0.764   0.699   0.689   0.692   0.649
12B     0.799   0.797   0.795   0.787   0.787   0.681



Only the first five eigenvalues of unrotated factors have been exploited for
comparisons of eigenvalue distributions. The deviations of the subsequent
eigenvalue pairs are small throughout: the greatest deviation almost always
belongs to the first eigenvalue pair. The sum of the absolute differences of
the first five eigenvalue pairs is presented as an index of deviation. The sum
is then, of course, an upper limit for individual differences and it has been
calculated for differences between eigenvalues of set I and those in II, III, IV,
V and VI. These sums are given in table 4 with certain summaries in table 5.

The transformations of III and IV almost always cause small deviations,
smaller than those of a new data generation (set II). Set V involves deviations
of about the same size as for II, sometimes somewhat smaller and sometimes
a bit greater. The non-monotonic transformation of VI on the other hand gives
rise to great deviations. Those depend mainly on the fact that the first factor
for every case of I has great loadings on most variables, while, for VI, the
first factor has great loadings only on the y variables and the second factor
is defined by the z variables. Those variables, which have been transformed
from y to z, consequently measure something else now.

Table 4. Eigenvalue deviation from set I
(? marks an entry or digit that is illegible or missing in the source)

Case      II      III     IV      V       VI

1A      0.724   0.145   0.115   0.163    1.188
1B      0.701   0.138   0.094   0.179    1.439
2A      0.174   0.244   0.218   0.249    4.422
2B      0.246   0.222   0.163   0.231    4.899
3A      0.118   0.302   0.318   0.303    6.969
3B      0.156   0.447   0.326   0.361    6.527
4A      0.222   0.043   0.024   0.202    1.299
4B      0.160   0.026   0.012   0.194    1.21?
5A      0.102   0.070   0.169   0.245    3.970
5B      0.158   0.034   0.112   0.236    4.383
6A      0.170   0.146   0.253   0.292    5.088
6B      0.257   0.119   0.176   0.351    5.873
7A      0.637   0.101   0.087   0.626    2.664
7B      0.587   0.083   0.075   0.638    2.802
8A      0.653   0.243   0.367   0.722   12.620
8B      0.309   0.252   0.345   0.724   12.504
9A      1.316   0.271   0.598   0.509   16.464
9B      1.464   0.260   0.476   0.499   16.997
10A     1.322   0.068   0.031   0.514    2.550
10B     1.338   0.080   0.031   0.530    2.658
11A     0.954   0.208   ?       1.167   10.223
11B     0.741   0.169   0.306   1.219   10.239
12A     1.279   0.151   0.456   0.482   11.009
12B     1.261   0.146   0.4?6   0.481   11.760



According to table 5 the average deviation, for a given set, is the same
irrespective of whether R² or 1.0 has been used. There is an indication of
greater random error with more variables (comparison I-II) and that VI,
and perhaps also V, deviate more for p = 30 than for p = 10. It seems to have
no importance whether the number of factors is small or great. For III,
IV and VI, transformations have a greater influence on high than on low
correlations, which seems reasonable considering formulas 1 and 2. However,
we must not forget that the transformations of III and IV have almost
no influence: factor analysis on y or on, e.g., one of these transformations
of y gives in principle the same result.

Table 5. Summary of table 4

Summary                     II      III     IV      V       VI

Diagonal value   R²        0.639   0.173   0.255   0.456   6.455
                 1.0       0.615   0.165   0.213   0.470   6.774

p                10        0.266   0.169   0.165   0.250   3.856
                 30        0.988   0.169   0.303   0.676   9.374

m                low       0.590   0.233   0.265   0.434   7.375
                 high      0.664   0.105   0.203   0.492   5.855

ρ                low       0.711   0.086   0.059   0.381   1.976
                 medium    0.418   0.180   0.264   0.599   7.908
                 high      0.752   0.242   0.380   0.409   9.961

Total                      0.627   0.169   0.234   0.463   6.615



And now some words about the squared multiple correlation between y_i
and x_1, ..., x_m, the so called communality (P_i). It is known (see e.g.
Rozeboom, 1966, p. 2?1) that the squared multiple correlation between y_i
and y_1, ..., y_(i-1), y_(i+1), ..., y_p cannot exceed P_i, when these
quantities are based on an infinite number of objects. The relation may be
another in a sample and I have examined whether the sample value of this
squared multiple correlation, R_i², which is an often used estimate of P_i,
seems to be good for this purpose. The communality is known for I and II
and I have looked through set I with the result that R_i² presumably can be
said to constitute a reasonable estimate of P_i. Few differences are greater
than 0.10 and they are greatest for low values, because the bias R_i² - P_i
is then not negligible.

A simple investigation of whether the transformations change P_i has also
been made. But P_i is not known for the transformed sets and comparisons
have therefore been made on sample values calculated from the factor
loading matrix. (The matrix has m factors, except for the cases with
m = 10, where only five factors have been used.) The same pattern as for
eigenvalues comes back. The monotonic transformations have hardly any
influence but the non-monotonic one does. Deviations over 0.05 are rare for
III, IV and V, while deviations of 0.20 are not unusual for VI: maximal
deviation per case varies here between 0.12 and 0.84.

Though eigenvalues and communalities do not deviate from each other
(comparisons I-III, I-IV and I-V), this does not usually imply that factor
loadings must be similar too. However, a superficial inspection of these
(for unrotated factors) shows no new picture. Factor loadings of set I are
for every case similar to corresponding loadings of sets III, IV and V:
a difference over 0.10 is a rarity. On the other hand, loadings are differently
structured in VI and great differences are common. There is therefore
reason to presume that factors of I, III, IV and V - but not of VI - would
have been interpreted in similar ways if the variables had had some
empirical anchoring.

DISCUSSION 

As I see it, a measurement process consists of three stages: a definition
of a property, a choice of an instrument and the allocation of numbers
to the possible outcomes of the instrument. In some areas researchers
have been able to agree upon a definition of a property so precise that all
admissible combinations of instruments and numbers determine linearly
related variables. This is hardly the case in educational research. I believe
that the definitions are here sometimes so diffuse that possible combinations
made by researchers, believing in the same definition, generate variables
which are not even monotonically related in their specific true scores. That
is, the difference in one true score variable does not have the same sign as
the difference in another true score variable for every pair of objects. I have
sometimes heard pronouncements like "this is probably only an ordinal
scale" just as if it should be self-evident that a variable represents a property
according to the requirements of the ordinal scale. Strictly speaking, the
pronouncement is a contradiction, e.g., every time a researcher constructs
two alternative instruments for measuring a property and does not find the
two true score variables monotonically related, in the sense that both
instruments do not produce ordinal scales for the same property.
However, it may well also be so that stochastic relations, which
essentially approximate monotonic relations (e.g. 90 per cent of pairs of
objects have differences with the same sign for two variables), are sufficient
for several analytical purposes. But this robustness of results based on
different collections of variables is something that we know rather little
about. What we would like to know is in which situations robustness occurs -
and does not occur. This information could then be used to focus our efforts
to improve measurements for situations where robustness does not exist.
We may imagine a two-dimensional contingency table with different properties
as columns and different statistical methods as rows and where every cell
can be said to define a situation. My presumption is that some methods are
more robust than others, e.g. a linear product-moment correlation seems
to be much more scale independent than a statistical test about equal
covariance matrices. Likewise, properties may also vary in robustness:
diffusely defined properties generate less robustness because the admissible
choices of variables are so great. This would mean that some properties may
be sufficiently well defined for certain methods but not for others. A wise
selection of methods and properties could form a basis for a research program
of empirical investigations of robustness.

My simulation experiment is not tied to certain properties but to one
method and thus more or less mirrors the conditions for a whole row of the
above-mentioned contingency table. In that it is presumably more general
than an empirical investigation. On the other hand, the experiment involves
the restriction of only examining non-stochastic relations, which means that
it primarily treats the robustness of the third stage of the measurement
process: the allocation of numbers, given a certain collection of instruments.
The special choices of factor-analytic method, ways of comparison and
transformations may also restrict the generalizability of this investigation.
I will briefly comment upon these choices.

The question as to whether another method of factor analysis would have
produced different results seems to be difficult to answer. I would like to
answer in the negative, but this is only what I believe. In my experience,
several descriptive methods seem to be rather robust to many (but not
necessarily all) monotonic transformations, while inferential methods need
not be. More exactly: several estimates are often little dependent on the
form of the distribution, but the probabilistic evaluation of a statistical test
quantity can be very sensitive to different kinds of distribution functions. I
therefore believe that the results obtained in this report would not have been
essentially different if the estimates had been produced by another technique,
say maximum likelihood factor analysis.



Man has a limited conception of multidimensional phenomena, at least
if they involve more than three dimensions. One may argue that it is rather
meaningless to present descriptions more or less void of characteristics
that can be exploited, even if they are of interest to statistical theory. (For
instance, I find the determinant of a covariance matrix much less
understandable than its trace.) It is not easy to see what kinds of comparisons
will be the most fruitful ones to undertake in factor analysis, especially
since simulated data have no empirical anchoring. I have chosen to focus
the comparisons on some characteristics commonly used in factor analysis.
The evaluation of the comparisons between untransformed and transformed
data has also been facilitated by a second generation of data (set II). Since
all the comparisons made seem to tell the same story there are reasons to
believe that other numerical comparisons would not have altered the results.
But it would be valuable to supplement this report with parallel investigations
on real data in order to elucidate the influence of transformations on the
interpretation of factors.

Larsson (1973) gives an example where a correlation has been considerably
changed by monotonic transformations. Similar results have been obtained by
others, e.g. Box & Cox (1964) or Kruskal (1965). It is not easy to state under
what conditions correlations can be changed much or little by monotonic
transformations, but I suppose that there exist correlation matrices containing
several correlations which can be changed appreciably so that the factor
structure will also change; I have no idea at all whether this happens
frequently or not. We must not forget that I have chosen some rather common
monotonic transformations independent of data. They are certainly not
optimal in the sense that they change the factor structure as much as possible.
On the other hand, it may well be so that in many cases the maximal
change is negligible. (Notice that the robustness of data to the rank
transformation does not imply robustness to any monotonic transformation. The
rank transformation is dependent on the distribution, e.g. a rectangular
distribution implies no change at all.) Therefore it is perhaps wisest to state
a conditional conclusion: the monotonic transformations used show hardly
any non-robustness of factor analysis.

If the results obtained in this study should occur often, methods
like nonmetric and nonlinear factor analyses (see e.g. Lingoes & Guttman,
1967, Carroll, 1972, and McDonald, 1962) would seldom be necessary.
This is in line with what Shepard (1972, p. 37) says about his and Kruskal's
nonmetric variety of factor analysis: "... it has never been widely used. ...
the method tends - except in the case of extremely nonlinear data - to
yield representations that differ but little from those obtained by classical
(linear) factor analysis."